Abstract: In this work, a coding technique called cost constrained
Geometric Huffman coding (CCGhc) is developed.
CCGhc minimizes the Kullback-Leibler distance between a
dyadic probability mass function (pmf) and a target pmf subject
to an affine inequality constraint. An analytical proof is given that
when CCGhc is applied to blocks of symbols, the optimum is
asymptotically achieved when the blocklength goes to infinity. The
derivation of CCGhc is motivated by the problem of encoding a
text to a sequence of slats subject to architectural design criteria.
For the considered architectural problem, for a blocklength of
3, the codes found by CCGhc match the design criteria. For
communications channels with average cost constraints, CCGhc
can be used to efficiently find prefix-free modulation codes that
are provably capacity achieving.

I. Introduction 

In the near future, parts of the electrical engineering faculty 
of RWTH Aachen University will move into new buildings 
called Information and Communication Technology (ICT) 
cubes. To protect the cubes against heating up in sunlight,
the idea is to shadow the facades by placing rows of slats
in front of them. The slats themselves come in three forms: left,
right, and middle. All slat types have a height of 1.70m. The
widths are 0.18m, 0.18m, and 0.31m, respectively.
A slat is placed every 0.625m. See also Fig. 1 for a visualization
of the cubes. To cover all eight facades of the two cubes, a
total of 4264 slats is required. The actual choice of
slats is subject to the following design criteria. 

C1. For aesthetic reasons, the sequence of slats should appear random.
C2. To ensure enough cooling, around 33% of the facade area should be covered by the slats.
C3. Since shadow turns the rooms dark, the total shadowing should not exceed 33%.

Observing that many different sequences of slats fulfill the 
above constraints, Mr Mathar came up with the idea to encode 
a text to the sequence of slats, when read row by row from left 
to right. Thus, the challenge is to encode a text to a sequence
of slats subject to the design criteria C1., C2., and C3.

This work has been supported by the UMIC Research Center, RWTH 
Aachen University. 




Fig. 1. Visualization of the ICT cubes. 



In the remainder of this work, we develop a coding scheme 
that solves this problem. The key part of our scheme is a new 
algorithm that we call cost constrained geometric Huffman 
coding (CCGhc). This algorithm minimizes the Kullback-Leibler
distance between a dyadic probability mass function
(pmf) and a target pmf subject to an affine inequality constraint.
Interestingly, in the context of channel matching [1],
CCGhc can also be used to directly find capacity-achieving
modulation codes for communication channels with average
power constraint. This improves upon the broad search approach
that we presented in [2].

II. Approach 

A. Problem Modelling 

As stated in the introduction, there are three types of slats,
i.e., left, right, and middle slats. We index them in this order.
To turn the design problem into a tractable problem, we use
a probabilistic model. Assume each slat is drawn independent
and identically distributed (iid) from the set {1, 2, 3} according
to a pmf $p = (p_1, p_2, p_3)^T$. According to criterion C1., we
would ideally choose uniformly among the three types of slats.
Thus, we would like the pmf $p$ to be close to the uniform target
pmf $t = (1/3, 1/3, 1/3)^T$. As a distance measure, we use the
Kullback-Leibler (KL) distance, which is defined as

$$D(p\|t) = \sum_i p_i \log\frac{p_i}{t_i}. \tag{1}$$






Thus, criterion C1. can be cast into the objective to minimize
$D(p\|t)$. The widths of the slats in meters are

$$w = (w_1, w_2, w_3)^T = (0.18, 0.18, 0.31)^T \ \text{[m]}. \tag{2}$$



Every 0.625m, a slat is placed. Thus, by criterion C2., each
slat has to cover on average a width of

$$S = 33\% \cdot 0.625 = 0.2063. \tag{3}$$

Note that $w^T t / 0.625 \approx 36\%$, i.e., when using the uniform
distribution, the shadowing is too strong and criterion C3. is
violated. Thus, criteria C2. and C3. can be cast into the affine
inequality constraint $w^T p \le S$.
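The arithmetic behind the target width and the 36% figure can be reproduced in a few lines. The following is a minimal sketch that uses only the widths, the spacing, and the 33% target stated above; the variable names are chosen here for illustration.

```python
# Slat widths in meters (left, right, middle) and slat spacing, as given in the text.
w = [0.18, 0.18, 0.31]
spacing = 0.625

# Target average width per slat for 33% coverage, cf. Eq. (3).
S = 0.33 * spacing

# Average width and coverage under the uniform pmf t = (1/3, 1/3, 1/3).
t = [1 / 3] * 3
uniform_width = sum(wi * ti for wi, ti in zip(w, t))
uniform_coverage = uniform_width / spacing

print(f"S = {S:.5f} m")
print(f"coverage under the uniform pmf = {uniform_coverage:.1%}")
```

The uniform pmf covers more than the allowed 33%, which is why the constraint is active in what follows.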

Pmfs of the slats can be generated as follows. We do
source-channel separation with a binary interface, i.e., we first
compress the text to a binary sequence, and we then design a
code that maps the binary sequence to a sequence of slats with
the objective to match the design criteria. The text compression
part is a well-studied topic. For now, we therefore assume
perfect compression, i.e., after text compression, we have a
stream of iid equiprobable bits. By parsing the binary stream
by a full prefix-free code, we can generate dyadic pmfs $d$ [1],
i.e., pmfs where each entry $d_i$ is of the form

$$d_i = 2^{-\ell_i}, \qquad \ell_i \in \mathbb{N}. \tag{4}$$
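To illustrate how a full prefix-free code turns equiprobable bits into a dyadic pmf, consider the following minimal sketch. The three-codeword code used here is hypothetical and serves only as an example; it is not one of the codes designed in this work.

```python
from fractions import Fraction

# Hypothetical full prefix-free code on the three slat types (illustration only).
code = {"0": "l", "10": "r", "11": "m"}

def induced_pmf(code):
    """Probability of each symbol when the codewords parse iid equiprobable bits."""
    return {sym: Fraction(1, 2 ** len(word)) for word, sym in code.items()}

def parse(bits, code):
    """Parse a bit string into symbols; trailing bits that form no codeword are dropped."""
    out, word = [], ""
    for b in bits:
        word += b
        if word in code:
            out.append(code[word])
            word = ""
    return out

d = induced_pmf(code)
print(d)                       # each entry is a power of 1/2
print(parse("010110", code))   # symbols obtained from a short bit string
```

Because the code is full, the codeword probabilities sum to one, so the induced pmf is a valid dyadic pmf.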



Thus, our objective is to approximate the target pmf $t$ by a
dyadic pmf $d$ while guaranteeing on average a shadowing
of at most $S$. Within the probabilistic model, the criteria C1.-C3.
can now be cast into the following optimization problem.

$$\begin{aligned}
\underset{d}{\text{minimize}}\quad & D(d\|t) \\
\text{subject to}\quad & w^T d \le S \\
& d \text{ is a dyadic pmf.}
\end{aligned} \tag{5}$$



B. Cost Constrained Geometric Huffman Coding 

Without the restriction to dyadic pmfs, problem (5) is
a convex optimization problem and can be solved efficiently.
However, the restriction to dyadic pmfs makes the feasible set
discrete, and the problem is no longer convex.
To the best of our knowledge, no efficient algorithm is
known that directly solves the problem. We therefore write
the problem as a trade-off problem by adding a scaled version
$\lambda w^T d$ of the shadowing to the objective function. This can
be written as

$$D(d\|t) + \lambda w^T d = \sum_i d_i \log\frac{d_i}{t_i} + \lambda w^T d \tag{6}$$

$$= D(d\|t \circ 2^{-\lambda w}), \tag{7}$$

where $\circ$ denotes the entrywise product, $2^{-\lambda w}$ is taken entrywise,
and the logarithm is to base 2. The solution can be found efficiently by
geometric Huffman coding (Ghc), i.e., $d = \mathrm{GHC}(t \circ 2^{-\lambda w})$. See [1] for the
definition of Ghc and [3] for an implementation in Matlab.
Fig. 2. We first compress the text to a binary sequence and then match the
binary sequence to the design criteria by using CCGhc.
[Block diagram: shannon.. -> Huffman -> 111000.. -> CCGhc -> rllmrm..]

The shadowing constraint can be guaranteed by iteratively
adapting $\lambda$: if $w^T d > S$ for the resulting $d$, increase $\lambda$ and
repeat; if $w^T d < S$, decrease $\lambda$ and repeat. Thus, the solution
can be found by bisection. In summary, we have the following
algorithm, which we call cost constrained geometric Huffman
coding (CCGhc).



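Algorithm 1. (CCGhc)

Choose $\ell$, $u$ such that $\ell \le \lambda^* \le u$
repeat
    1. $\lambda = (\ell + u)/2$
    2. $d = \mathrm{GHC}(t \circ 2^{-\lambda w})$
    3. if $w^T d \le S$, $u \leftarrow \lambda$; else $\ell \leftarrow \lambda$
until $u - \ell \le \epsilon$
$\lambda^* = u$
$d = \mathrm{GHC}(t \circ 2^{-\lambda^* w})$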
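The bisection above can be sketched in Python as follows. This is not the Matlab implementation referenced in [3]. The Ghc merge rule in `ghc` is an assumption, derived from the objective $\sum_i d_i \log(d_i/x_i)$: two sibling weights $a \ge b$ act like a single weight $2\sqrt{ab}$, and dropping $b$ is preferable whenever $a \ge 4b$. See [1] for the authoritative definition. The names `ghc`, `ccghc`, `lo`, `hi`, and `eps` are illustrative.

```python
import heapq
import math

def ghc(x):
    """Sketch of geometric Huffman coding: dyadic d minimizing sum_i d_i*log(d_i/x_i).

    Assumed merge rule for the two smallest weights a >= b:
    drop b if a >= 4*b, otherwise replace both by 2*sqrt(a*b).
    """
    # Heap entries: (weight, unique tie-breaker, indices of the leaves below this node).
    heap = [(xi, i, [i]) for i, xi in enumerate(x)]
    heapq.heapify(heap)
    depth = [0] * len(x)
    alive = [True] * len(x)
    counter = len(x)
    while len(heap) > 1:
        b, _, leaves_b = heapq.heappop(heap)
        a, tie_a, leaves_a = heapq.heappop(heap)
        if a >= 4 * b:
            for i in leaves_b:              # the light node gets probability zero
                alive[i] = False
            heapq.heappush(heap, (a, tie_a, leaves_a))
        else:
            for i in leaves_a + leaves_b:   # both nodes move one level down the tree
                depth[i] += 1
            heapq.heappush(heap, (2 * math.sqrt(a * b), counter, leaves_a + leaves_b))
            counter += 1
    return [2.0 ** -depth[i] if alive[i] else 0.0 for i in range(len(x))]

def ccghc(t, w, S, lo=0.0, hi=100.0, eps=1e-9):
    """Bisection over lambda as in Algorithm 1; hi is assumed large enough to be feasible."""
    def solve(lam):
        return ghc([ti * 2.0 ** (-lam * wi) for ti, wi in zip(t, w)])
    while hi - lo > eps:
        lam = (lo + hi) / 2
        d = solve(lam)
        if sum(wi * di for wi, di in zip(w, d)) <= S:
            hi = lam
        else:
            lo = lam
    return solve(hi), hi

# Single-symbol case (blocklength 1) for the slat problem.
d, lam = ccghc([1 / 3] * 3, [0.18, 0.18, 0.31], 0.33 * 0.625)
print(d, lam)
```

For blocklength 1, the only dyadic pmfs that meet the constraint avoid the middle slat altogether, which illustrates why coding over blocks of symbols is needed.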



C. Asymptotic Achievability 

To evaluate the quality of the dyadic pmf found by CCGhc,
we compare it to what can be achieved when dropping the
restriction to dyadic pmfs, i.e., when allowing the argument
in problem (5) to be any pmf $p$ from the probability simplex.
Denote the optimal pmf from the probability simplex by $p^*$.
Since there is only a finite number of dyadic pmfs of a given
length, the dyadic pmf $d$ found by CCGhc
may perform considerably worse than the optimal
pmf $p^*$. This problem can be solved by generating dyadic
pmfs of blocks of symbols. Consider the target pmf $t^k$ of $k$
consecutive symbols. The corresponding shadowing is given
by the Kronecker sum $v_k = w^{\oplus k}$ of $k$ copies of $w$. For an
increasing blocklength, we have the following result.

Proposition 1. Define $d_k = \mathrm{CCGhc}(t^k, v_k, kS)$. Then,

$$\frac{D(d_k\|t^k)}{k} \xrightarrow{k\to\infty} D(p^*\|t) \tag{8}$$

and

$$\frac{v_k^T d_k}{k} \xrightarrow{k\to\infty} S, \qquad \frac{v_k^T d_k}{k} \le S, \tag{9}$$

i.e., the distance from the target pmf per symbol converges
to the optimal value and the average shadowing per symbol
converges to the target shadowing $S$, while the shadowing
constraint is always fulfilled.

Proof: The proof is given in Section IV. ■
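The block quantities $t^k$ and $v_k$ that enter Proposition 1 can be built explicitly. The sketch below enumerates all length-$k$ blocks in lexicographic order; the helper names `block_pmf` and `kronecker_sum` are chosen here for illustration.

```python
from itertools import product
from math import prod

def block_pmf(t, k):
    """Product pmf t^k of k iid symbols, one entry per block in lexicographic order."""
    return [prod(block) for block in product(t, repeat=k)]

def kronecker_sum(w, k):
    """Cost vector v_k: the total cost of each length-k block, in the same order."""
    return [sum(block) for block in product(w, repeat=k)]

t = [1 / 3] * 3
w = [0.18, 0.18, 0.31]
t3 = block_pmf(t, 3)
v3 = kronecker_sum(w, 3)

# The mean block cost equals k times the per-symbol mean cost.
mean_block_cost = sum(ti * vi for ti, vi in zip(t3, v3))
print(len(v3), mean_block_cost)
```

For $k = 3$ and three slat types, both vectors have 27 entries, which matches the 27 codewords of the matching code used later.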

III. Writing to the ICT Cubes 

We now apply CCGhc to solve the design problem of
finding an encoding scheme subject to the design criteria C1.-C3.
as stated in the introduction. The text that we write to the
facades of the ICT cubes consists of quotes from scientists who
significantly contributed to the development of information
and communications technology. Our coding scheme consists
of two parts. We first compress the text to a binary sequence
by Huffman coding, and then match the binary sequence to the
design criteria by using CCGhc. See Fig. 2 for an illustration.

TABLE I
The employed Huffman code.

_ : 000        i : 0101         r : 1010
a : 0100       j : 001111111    s : 1110
b : 101110     k : 00111101     t : 0010
c : 01101      l : 01100        u : 10011
d : 11110      m : 10110        v : 00111100
e : 110        n : 1000         w : 101111
f : 11111      o : 0111         x : 001111100
g : 001110     p : 100101       y : 100100
h : 00110      q : 001111110    z : 001111101

TABLE II
The matching code induced by $d_3 = \mathrm{CCGhc}(t^3, v_3, 3S)$.

0010 : lll       1110 : rll      01000 : mll
1101 : llr       1001 : rlr      01011 : mlr
00000 : llm      01100 : rlm     001101 : mlm
1100 : lrl       1000 : rrl      01010 : mrl
1111 : lrr       1011 : rrr      1010 : mrr
00011 : lrm      01111 : rrm     001100 : mrm
00010 : lml      01110 : rml     001111 : mml
01101 : lmr      01001 : rmr     001110 : mmr
0000111 : lmm    000010 : rmm    0000110 : mmm

Fig. 3. Decoding the top floor with the codes specified in Table I and Table II results in "shannon the fu". This is the first
part of "shannon the fundamental problem of communication is that of reproducing at one point either exactly or
approximately a message selected at another point", a phrase taken from the first chapter of [4].


A. Text Compression 

To keep the number of symbols small, we write the text
using only lowercase Latin letters and the space character, which
results in an alphabet size of 27. To map the text to a binary
sequence, we use the Huffman code [5] of the relative symbol
frequencies in the text. See Table I for the resulting code. 49.4%
of the bits in the resulting binary sequence are zeros and
50.6% are ones, so roughly speaking, our assumption of
an iid sequence of equiprobable bits at the binary interface is
reasonable.
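The compression step can be tried out directly with the codewords of Table I. The following sketch transcribes the table by hand (so any transcription slip is ours) and uses illustrative helper names `encode` and `decode`; the underscore stands for the space character.

```python
# Codewords as read from Table I; "_" stands for the space character.
HUFFMAN = {
    "_": "000", "a": "0100", "b": "101110", "c": "01101", "d": "11110",
    "e": "110", "f": "11111", "g": "001110", "h": "00110", "i": "0101",
    "j": "001111111", "k": "00111101", "l": "01100", "m": "10110",
    "n": "1000", "o": "0111", "p": "100101", "q": "001111110",
    "r": "1010", "s": "1110", "t": "0010", "u": "10011",
    "v": "00111100", "w": "101111", "x": "001111100", "y": "100100",
    "z": "001111101",
}

def encode(text):
    """Map lowercase text (letters and spaces) to a bit string."""
    return "".join(HUFFMAN["_" if ch == " " else ch] for ch in text)

def decode(bits):
    """Invert encode; relies on the code being prefix-free."""
    inverse = {cw: sym for sym, cw in HUFFMAN.items()}
    out, word = [], ""
    for b in bits:
        word += b
        if word in inverse:
            out.append(" " if inverse[word] == "_" else inverse[word])
            word = ""
    return "".join(out)

# A full prefix-free code satisfies the Kraft inequality with equality.
kraft = sum(2.0 ** -len(cw) for cw in HUFFMAN.values())
bits = encode("shannon the fu")
print(kraft, bits, decode(bits))
```

That the Kraft sum equals one confirms that the transcribed code is full, which is what makes the decoder above unambiguous.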

B. Criteria Matching 

We now map the binary sequence blockwise to a sequence
of slats. The objective is to match the design criteria C1.-C3.
as stated in the introduction. To see how close we are to the
optimum, we calculate the optimal pmf $p^*$ when the restriction
to dyadic pmfs is dropped. The optimal pmf is given by

$$p^* = (0.3988, 0.3988, 0.2023)^T. \tag{10}$$

This is the pmf closest to the uniform pmf, and thus the best match
of criterion C1., that fulfills the shadowing constraints C2.
and C3.
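The value of $p^*$ can be reproduced numerically. The sketch below assumes the exponential form $p_i^* \propto t_i 2^{-\lambda w_i}$ that follows from the KKT conditions in Section IV, and finds $\lambda$ by bisection on the average width. It uses the rounded target $S = 0.2063$ from (3); the function names are illustrative.

```python
def tilted_pmf(t, w, lam):
    """Normalized pmf proportional to t_i * 2^(-lam * w_i)."""
    x = [ti * 2.0 ** (-lam * wi) for ti, wi in zip(t, w)]
    z = sum(x)
    return [xi / z for xi in x]

def optimal_pmf(t, w, S, lo=0.0, hi=100.0, tol=1e-12):
    """Closest pmf to t (in KL distance) with average cost at most S."""
    cost = lambda p: sum(wi * pi for wi, pi in zip(w, p))
    if cost(t) <= S:            # constraint inactive: the target itself is optimal
        return list(t)
    while hi - lo > tol:        # the cost decreases in lam, so bisection applies
        lam = (lo + hi) / 2
        if cost(tilted_pmf(t, w, lam)) <= S:
            hi = lam
        else:
            lo = lam
    return tilted_pmf(t, w, hi)

p_star = optimal_pmf([1 / 3] * 3, [0.18, 0.18, 0.31], 0.2063)
print([round(pi, 4) for pi in p_star])
```

Because left and right slats have the same width, they receive the same probability, and the constraint alone then fixes the share of middle slats.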

We choose $k = 3$ as blocklength for the matcher codes. As
a first matcher code, we use the code induced by the dyadic
pmf $d_3 = \mathrm{CCGhc}(t^3, v_3, 3S)$. The resulting code is displayed
in Table II. The first row of the resulting sequence of slats is
displayed in Fig. 3. The interested reader is invited to decode it
by using first the matching code in Table II in inverse direction
and then the Huffman code in Table I in inverse direction. The
effective relative frequencies of the slats are

$$p_{\mathrm{eff}} = \frac{1}{n}\left(\#\{\text{left}\}, \#\{\text{right}\}, \#\{\text{middle}\}\right)^T \tag{11}$$

$$= (0.3838, 0.39457, 0.22162)^T, \tag{12}$$

where $n$ denotes the total number of slats in the sequence.
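For comparison, one can compute the slat frequencies that the matching code would induce under the idealized assumption of iid equiprobable input bits. The sketch below transcribes the codewords of Table II by hand (so any transcription slip is ours); the name `TABLE_II` and the helper `model_frequency` are illustrative.

```python
from fractions import Fraction

# Matching code as read from Table II: slat block -> codeword.
TABLE_II = {
    "lll": "0010", "llr": "1101", "llm": "00000",
    "lrl": "1100", "lrr": "1111", "lrm": "00011",
    "lml": "00010", "lmr": "01101", "lmm": "0000111",
    "rll": "1110", "rlr": "1001", "rlm": "01100",
    "rrl": "1000", "rrr": "1011", "rrm": "01111",
    "rml": "01110", "rmr": "01001", "rmm": "000010",
    "mll": "01000", "mlr": "01011", "mlm": "001101",
    "mrl": "01010", "mrr": "1010", "mrm": "001100",
    "mml": "001111", "mmr": "001110", "mmm": "0000110",
}

def model_frequency(symbol):
    """Per-slat frequency of a symbol if each block occurs with probability 2^(-length)."""
    per_block = sum(block.count(symbol) * Fraction(1, 2 ** len(cw))
                    for block, cw in TABLE_II.items())
    return per_block / 3

kraft = sum(Fraction(1, 2 ** len(cw)) for cw in TABLE_II.values())
freq = {s: model_frequency(s) for s in "lrm"}
model_shadowing = sum(float(freq[s]) * width
                      for s, width in zip("lrm", (0.18, 0.18, 0.31)))
print(kraft, freq, model_shadowing)
```

Under this idealized model the code stays within the target width; the effective frequencies in (12) differ because the Huffman-coded bits are only approximately iid and equiprobable.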

As we can see, $p_{\mathrm{eff}}$ is very close to $p^*$. The effective
shadowing is

$$S_{\mathrm{eff}} = 0.20881, \tag{13}$$

which corresponds to an average shadowing of 33.4%. This
exceeds the target percentage of 33% by 0.4 percentage points
and thus violates criterion C3. This problem can be fixed as
follows. We use a stricter shadowing constraint $S' = 0.206$
instead of the original target constraint $S = 0.2063$ and
calculate $d_3' = \mathrm{CCGhc}(t^3, v_3, 3S')$. The effective relative
slat frequencies that result from the code induced by $d_3'$ are
now

$$p_{\mathrm{eff}}' = (0.39132, 0.4317, 0.17698)^T \tag{14}$$

and the effective shadowing is

$$S_{\mathrm{eff}}' = 0.20301. \tag{15}$$

This corresponds to an average shadowing of 32.5% and thus
fulfills criterion C3. Note that $p_{\mathrm{eff}}$ is closer to the uniform
pmf than $p_{\mathrm{eff}}'$ and thus better matches criterion C1. It is now
up to the architects to choose between the codes $d_3$ and $d_3'$, i.e., to
find the best trade-off between criterion C1. and the criteria
C2. and C3. for their purpose.
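The reported shadowing values and the comparison with the uniform pmf follow directly from the effective frequencies. The following sketch recomputes them from (12) and (14); the helper names are chosen here for illustration.

```python
import math

w = [0.18, 0.18, 0.31]   # widths of left, right, middle slats in meters
spacing = 0.625          # one slat every 0.625 m

def shadowing(p):
    """Average covered width per slat for frequencies p = (left, right, middle)."""
    return sum(wi * pi for wi, pi in zip(w, p))

def kl_to_uniform(p):
    """KL distance in bits from p to the uniform pmf of the same length."""
    return sum(pi * math.log2(pi * len(p)) for pi in p if pi > 0)

p_eff = [0.3838, 0.39457, 0.22162]          # Eq. (12), first code
p_eff_strict = [0.39132, 0.4317, 0.17698]   # Eq. (14), stricter constraint

for name, p in (("first code", p_eff), ("stricter code", p_eff_strict)):
    s = shadowing(p)
    print(f"{name}: width {s:.5f} m, coverage {s / spacing:.1%}, "
          f"KL to uniform {kl_to_uniform(p):.4f} bits")
```

The first code is closer to uniform but slightly exceeds 33% coverage, while the stricter code meets the limit at the price of a larger KL distance, which is the trade-off left to the architects.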

IV. Analysis of ccGhc 

This section consists of two parts. In Subsection A, we derive
two lemmas that characterize the operating point geometry 
in terms of average cost and distance to the target pmf. We 
then use these two lemmas in Subsection B to actually prove 
Proposition 1. 



A. Operating Point Geometry 

We start by characterizing the region of achievable operating
points. We define the distance-cost function $D(E)$ pointwise
by the solution of

$$\begin{aligned}
\underset{p}{\text{minimize}}\quad & D(p\|t) \\
\text{subject to}\quad & w^T p - E \le 0 \\
& -p \le 0 \\
& \mathbf{1}^T p - 1 = 0,
\end{aligned} \tag{16}$$

i.e., if $p^*$ is the optimal pmf for $E = E^*$, then $D(E^*) =
D(p^*\|t)$. Note that the last two constraints restrict $p$ to the
probability simplex, i.e., ensure that $p$ is a pmf. By the
convention $\log 0 = -\infty$, clearly, whenever $t_i = 0$, the
optimal pmf assigns $p_i^* = 0$, since otherwise the objective
function would take the value infinity. Therefore, without loss
of generality, we assume in the following that $t_i > 0$ for all
$i$. The Lagrangian is

$$L(p, \lambda, \mu, \nu) = D(p\|t) + \lambda(w^T p - E) - \mu^T p + \nu(\mathbf{1}^T p - 1). \tag{17}$$

Assume $p$ is feasible. Then the KKT conditions are

$$\lambda \ge 0, \quad \mu \ge 0 \tag{18}$$

$$\lambda(w^T p - E) = 0 \tag{19}$$

$$\mu_i p_i = 0 \tag{20}$$

$$\frac{\partial L(p, \mu, \nu, \lambda)}{\partial p_i} = \log\frac{p_i}{t_i} + 1 - \mu_i + \nu + \lambda w_i = 0. \tag{21}$$

It can be shown that for Problem (16), a pmf $p$ is optimal
if and only if there are $\lambda, \mu, \nu$ such that $p$ fulfills the KKT
conditions. Denote now by $p^*, \lambda, \mu, \nu$ values that fulfill the
KKT conditions. By the last condition,

$$\log p_i^* = \log t_i - 1 + \mu_i - \nu - \lambda w_i. \tag{22}$$

Since by assumption $t_i > 0$, the right-hand side is finite,
and therefore $p_i^* > 0$. Thus, by (20), $\mu_i = 0$ and we conclude

$$\log p_i^* = \log t_i - 1 - \nu - \lambda w_i, \qquad i = 1, \dots, m. \tag{23}$$



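Lemma 1. For $w_{\min} < E < w^T t$, the distance-cost function
$D(E)$ is strictly convex in $E$.

Proof: Denote by $p^*$ an optimal pmf for $E = E^*$. Since
$p^*$ is a pmf, we have by (23)

$$\sum_i p_i^* = 1 \tag{24}$$

$$\Rightarrow\quad p_i^* = \frac{t_i e^{-\lambda w_i}}{\sum_j t_j e^{-\lambda w_j}}. \tag{25}$$

Since by assumption $E^* < w^T t$, the average weight constraint
is active, which implies $\lambda > 0$. Thus, by (19), $w^T p^* = E^*$,
i.e.,

$$\frac{\sum_i w_i t_i e^{-\lambda w_i}}{\sum_j t_j e^{-\lambda w_j}} = f(\lambda) = E^*. \tag{26}$$

We differentiate $f(\lambda)$ and get

$$\frac{\partial f(\lambda)}{\partial \lambda} = \frac{-\sum_i w_i^2 t_i e^{-\lambda w_i} \sum_j t_j e^{-\lambda w_j} + \sum_i w_i t_i e^{-\lambda w_i} \sum_j w_j t_j e^{-\lambda w_j}}{\left(\sum_j t_j e^{-\lambda w_j}\right)^2}. \tag{27}$$

We now want to show that $\partial f(\lambda)/\partial \lambda < 0$. Since the denominator
is positive, we only need to consider the numerator. We have

$$\sum_i \sum_j (w_i w_j - w_i^2)\, t_i t_j e^{-\lambda(w_i + w_j)} \tag{28}$$

$$= \sum_i \sum_{j>i} (2 w_j w_i - w_i^2 - w_j^2)\, t_i t_j e^{-\lambda(w_i + w_j)} \tag{29}$$

$$= -\sum_i \sum_{j>i} (w_i - w_j)^2\, t_i t_j e^{-\lambda(w_i + w_j)} < 0, \tag{30}$$

where the inequality in the last line follows since there is
at least one pair $i, j$ such that $w_i \ne w_j$ and since, by
assumption, $t_i > 0$ for all $i$. Thus, $f$ is strictly monotonically
decreasing and thereby invertible on its image, i.e., on
$(w_{\min}, w^T t)$. Consequently, $\lambda = f^{-1}(E)$ is strictly monotonically
decreasing. By [6, Sec. 5.6.3], $\lambda = -\partial D(E)/\partial E$, thus,

$$\frac{\partial^2 D(E)}{\partial E^2} = -\frac{\partial f^{-1}(E)}{\partial E} > 0, \tag{31}$$

which shows the strict convexity of $D(E)$ in $E$.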
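The monotonicity of $f(\lambda)$ used in the proof can be checked numerically for the slat example. The sketch below evaluates $f$ from (26) on a grid of $\lambda$ values chosen here purely for illustration; it is a sanity check, not a substitute for the proof.

```python
import math

def f(lam, t, w):
    """Average cost w^T p*(lam) of the exponentially tilted pmf, cf. Eq. (26)."""
    x = [ti * math.exp(-lam * wi) for ti, wi in zip(t, w)]
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(x)

t = [1 / 3] * 3
w = [0.18, 0.18, 0.31]

lams = [0.5 * i for i in range(101)]      # grid on [0, 50]
costs = [f(lam, t, w) for lam in lams]

# f starts at w^T t for lam = 0 and decreases towards min(w).
print(costs[0], costs[-1])
```

On this grid the average cost falls strictly from $w^T t$ towards $w_{\min}$, in line with $f$ being invertible on $(w_{\min}, w^T t)$.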



Lemma 2. For a given cost constraint $E^*$, denote by $p^*$
an optimal pmf. Denote by $p$ an arbitrary pmf with the only
restriction that $p_i = 0$ whenever $p_i^* = 0$. Then

$$D(p\|t) = D(E^*) - \lambda(w^T p - E^*) + D(p\|p^*), \tag{32}$$

where $-\lambda$ is the slope of the tangent of $D$ at $(E^*, D(E^*))$.
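Proof:

$$D(p\|t) = \sum_i p_i \log\frac{p_i}{t_i} \tag{33}$$

$$= \sum_i p_i \log\frac{p_i\, p_i^*}{t_i\, p_i^*} \tag{34}$$

$$= \sum_i p_i \log\frac{p_i^*}{t_i} + D(p\|p^*) \tag{35}$$

$$= \sum_i p_i \log p_i^* - \sum_i p_i \log t_i + D(p\|p^*). \tag{36}$$

We further develop the first term:

$$\sum_i p_i \log p_i^* = \sum_i (p_i^* + p_i - p_i^*) \log p_i^* \tag{37}$$

$$= -H(p^*) + \sum_i (p_i - p_i^*) \log p_i^* \tag{38}$$

$$= -H(p^*) + \sum_i (p_i - p_i^*)(\log t_i - 1 - \nu - \lambda w_i) \tag{39}$$

$$= -H(p^*) - \lambda(w^T p - w^T p^*) + \sum_i p_i \log t_i - \sum_i p_i^* \log t_i \tag{40}$$

$$= D(p^*\|t) - \lambda(w^T p - w^T p^*) + \sum_i p_i \log t_i. \tag{41}$$

All together,

$$D(p\|t) = D(p^*\|t) - \lambda(w^T p - w^T p^*) + D(p\|p^*) \tag{42}$$

$$= D(E^*) - \lambda(w^T p - E^*) + D(p\|p^*). \tag{43}$$

Fig. 4. [Plot; horizontal axis: average cost E.]

Fig. 5. [Plot; horizontal axis: average cost E, with E', E'', and E* marked.]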
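The identity of Lemma 2 can be verified numerically. The sketch below picks an arbitrary target pmf, cost vector, and multiplier $\lambda$ (all hypothetical values chosen for illustration), forms $p^*$ from the exponential form (23), and compares both sides of (32) for a randomly drawn pmf $p$, using natural logarithms throughout.

```python
import math
import random

def kl(p, q):
    """KL distance in nats, with the convention 0*log(0/q) = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def normalize(x):
    z = sum(x)
    return [xi / z for xi in x]

# Hypothetical example values (not taken from the slat problem).
t = [0.5, 0.3, 0.2]
w = [0.18, 0.18, 0.31]
lam = 3.0

# Optimal pmf for the cost E* = w^T p*, cf. Eq. (23) and (25).
p_star = normalize([ti * math.exp(-lam * wi) for ti, wi in zip(t, w)])
E_star = sum(wi * pi for wi, pi in zip(w, p_star))

# An arbitrary pmf p with full support.
random.seed(0)
p = normalize([random.random() for _ in t])
cost_p = sum(wi * pi for wi, pi in zip(w, p))

lhs = kl(p, t)
rhs = kl(p_star, t) - lam * (cost_p - E_star) + kl(p, p_star)
print(lhs, rhs)
```

Both sides agree up to floating-point error, independent of which pmf $p$ is drawn.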



B. Proof of Proposition 1

We now show that any target operating point $Q^* =
(w^T p^*, D(w^T p^*))$ can be achieved by a dyadic pmf. We
do this in two steps. First, we show the existence of dyadic
operating points close to the target operating point, and then
we show that CCGhc actually finds them. Both results are a
direct consequence of the strict convexity of the distance-cost
function $D(E)$ that we stated in Lemma 1.

1) Existence of good dyadic points: Consider the optimal
pmf $p^{*k}$ of $k$ consecutive symbols. Define $v_k = w^{\oplus k}$, where
$w^{\oplus k}$ denotes the Kronecker sum of $k$ copies of $w$. Furthermore,
define $d_k = \mathrm{GHC}(p^{*k})$. By Lemma 2, the operating
point geometry becomes

$$\frac{D(d_k\|t^k)}{k} = D(E^*) - \lambda\left(\frac{v_k^T d_k}{k} - E^*\right) + \frac{D(d_k\|p^{*k})}{k}. \tag{44}$$

By [1, Prop. 2], since $d_k = \mathrm{GHC}(p^{*k})$, the normalized KL-distance
on the right-hand side goes to zero as $k \to \infty$.
Consider now Fig. 4. The tangent of $D(E)$ in $Q^*$ is given
by

$$g(E) = D(E^*) - \lambda(E - E^*). \tag{45}$$

As the normalized KL-distance of $d_k$ to $p^{*k}$ gets smaller, the
normalized KL-distance of $d_k$ to $t^k$ on the left-hand side of
(44) approaches the tangent $g$. However, because the tangent
is linear in $E$ and $D$ is strictly convex and lower bounds
$D(d_k\|t^k)/k$, the dyadic operating point $\left(v_k^T d_k / k,\ D(d_k\|t^k)/k\right)$ has
to approach $Q^*$ both in terms of distance and cost.

2) Finding good dyadic points: It remains to show that
algorithm CCGhc finds good dyadic points. This can best be
seen in Fig. 5. Suppose we want to find a dyadic pmf $d_k$ such
that for a given $\epsilon > 0$,

$$\frac{D(d_k\|t^k)}{k} < D(E^*) + \epsilon \quad\text{and}\quad \frac{v_k^T d_k}{k} \le E^*. \tag{46}$$

Define

$$E' : D(E') = D(E^*) + \epsilon \quad\text{and}\quad E'' = \frac{E' + E^*}{2}. \tag{47}$$

The chord from $Q^* = (E^*, D(E^*))$ to $Q' = (E', D(E'))$
cuts a segment from the area above $D$. Because of the
strict convexity of $D$, this segment is nonempty. Note that
all operating points in the segment fulfill the requirements
(46). As shown in Subsection IV-B1, for a big
enough $k$, there are dyadic operating points approximating
$Q'' = (E'', D(E''))$ that lie within this segment. Define now
$-\tilde{\lambda}$ as the slope of the chord, i.e.,

$$-\tilde{\lambda} = \frac{D(E') - D(E^*)}{E' - E^*}. \tag{48}$$

Now, $d_k = \mathrm{GHC}(t^k \circ 2^{-\tilde{\lambda} v_k})$ minimizes

$$D(d_k\|t^k) + \tilde{\lambda}\, v_k^T d_k \tag{49}$$

and will thus find a point in the segment. The slope $-\tilde{\lambda}$ will
also be evaluated by CCGhc, thus $d_k = \mathrm{CCGhc}(t^k, v_k, kE^*)$
will give a dyadic operating point at least as good as

$$d_k = \mathrm{GHC}(t^k \circ 2^{-\tilde{\lambda} v_k}). \tag{50}$$

This concludes the proof of Proposition 1.

References 

[1] G. Böcherer and R. Mathar, "Matching dyadic distributions to channels,"
in Proc. Data Compression Conf., 2011.

[2] G. Böcherer, F. Altenbach, and R. Mathar, "Capacity achieving modulation
for fixed constellations with average power constraint," in Proc.
IEEE Int. Conf. Commun. (ICC), 2011.

[3] G. Böcherer, "Geometric Huffman coding," http://www.georg-boecherer.de/ghc,
Dec. 2010.

[4] C. E. Shannon, "A mathematical theory of communication," Bell Syst.
Tech. J., vol. 27, pp. 379-423 and 623-656, Jul. and Oct. 1948.

[5] D. A. Huffman, "A method for the construction of minimum-redundancy
codes," Proc. IRE, vol. 40, no. 9, pp. 1098-1101, Sep. 1952.

[6] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge
University Press, 2004.



